Strategies for rescoring keyword search results using word-burst and acoustic features

Authors

  • Min Ma
  • Justin Richards
  • Victor Soto
  • Julia Hirschberg
  • Andrew Rosenberg
Abstract

The identification of keyword queries in speech data from low-resource languages poses a challenge for current methods, because speech recognition algorithms lack sufficient training data to produce high-accuracy transcripts. To compensate for these shortcomings, we extract signals from the data that are useful in keyword identification but are not used by the speech recognizer. These signals take multiple forms: word burstiness, rescored confusion network posteriors, and acoustic/prosodic qualities. The first of these, word burstiness, denotes the tendency for keywords to occur in bursts within a conversational topic. We employ three different strategies to exploit this information: 1) a four-way classification of keyword hypotheses that targets low-scoring correct hits and high-scoring false alarms, 2) ranking algorithms, and 3) a direct adjustment of keyword hit scores based on hypothesized repetition. We find that interpolating the results of these three strategies in an ensemble provides a reliable way to improve the results of keyword search.
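The third strategy and the final interpolation step can be illustrated with a minimal sketch. This is not the authors' implementation: the `Hit` record, the 60-second `window`, the boost weight `alpha`, the update rule, and the weighted-average `interpolate` function are all assumptions chosen only to show the idea of raising a hit's score when the same keyword is hypothesized nearby, then combining several systems' scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One keyword hypothesis (hypothetical record layout)."""
    keyword: str
    conversation: str
    time: float   # seconds from the start of the conversation
    score: float  # posterior-like confidence in [0, 1]


def burst_adjust(hits, window=60.0, alpha=0.5):
    """Return adjusted scores, one per hit, in input order.

    Assumed rule: a hit is boosted toward 1.0 in proportion to the
    strongest other hit of the same keyword within `window` seconds
    in the same conversation. Isolated hits are left unchanged.
    """
    adjusted = []
    for h in hits:
        neighbors = [
            o.score
            for o in hits
            if o is not h
            and o.keyword == h.keyword
            and o.conversation == h.conversation
            and abs(o.time - h.time) <= window
        ]
        support = max(neighbors, default=0.0)
        adjusted.append(h.score + alpha * support * (1.0 - h.score))
    return adjusted


def interpolate(score_lists, weights):
    """Weighted average of per-hit scores from several strategies."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, scores)) / total
        for scores in zip(*score_lists)
    ]


if __name__ == "__main__":
    hits = [
        Hit("water", "conv1", 10.0, 0.2),
        Hit("water", "conv1", 30.0, 0.8),
        Hit("water", "conv1", 500.0, 0.3),
    ]
    burst_scores = burst_adjust(hits)
    original_scores = [h.score for h in hits]
    print(interpolate([original_scores, burst_scores], weights=[1.0, 1.0]))
```

In this sketch the low-scoring hit at 10 s is raised because a confident hit of the same keyword occurs 20 s later, while the isolated hit at 500 s keeps its original score. In practice the window, the boost weight, and the interpolation weights would need to be tuned on held-out data.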


Similar resources

Spoken Keyword Rescoring and Document Retrieval for Low-resource Languages

For languages that have adequate data for automatic speech recognition (ASR), many keyword search (KWS) and spoken document retrieval (SDR) systems have been developed with near-optimal performance. However, without sufficient training data to produce high-accuracy transcripts, identification and retrieval of queries in speech data from low-resource languages remains challenging. To compensate for th...


Task dependent loss functions in speech recognition: A* search over recognition lattices

A recognition strategy that can be matched to specific system performance criteria such as word error rate or F-measure has recently been found to yield improvements over the usual maximum a-posteriori probability strategy [1] [2] [3]. In this matched-to-the-task strategy a hypothesis is chosen to minimize the expected loss or the Bayes Risk under a loss function defined by a performance measur...



Direct word graph rescoring using A* search and RNNLM

The use of Recurrent Neural Network Language Models (RNNLMs) has yielded significant improvements in Automatic Speech Recognition (ASR) tasks. However, to take advantage of their ability to consider long histories, they are usually used to rescore N-best lists (i.e., in practice it is not possible to use them directly during acoustic trellis search). We propose in this pape...


Echolocation: Using Word-Burst Analysis to Rescore Keyword Search Candidates in Low-Resource Languages



Publication date: 2014